Using untranscribed training data to improve performance
نویسندگان
چکیده
This paper explores techniques for utilizing untranscribed training data pools to increase the available training data for automatic speech recognition systems. It has been well established that current speech recognition technology, especially in Large Vocabulary Conversational Speech Recognition (LVCSR), is largely language independent, and that the dominant factor with regards to performance on a certain language is the amount of available training data ([4]). The paper addresses this need for increased training data by presenting ways to use untranscribed acoustic data to increase the training data size and thus improve speech recognition.
منابع مشابه
Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration
We present our study on semi-supervised Gaussian mixture model (GMM) hidden Markov model (HMM) and deep neural network (DNN) HMM acoustic model training. We analyze the impact of transcription quality and data sampling approach on the performance of the resulting model, and propose a multisystem combination and confidence re-calibration approach to improve the transcription inference and data s...
متن کاملImproving Speaker-Independent Lipreading with Domain-Adversarial Training
We present a Lipreading system, i.e. a speech recognition system using only visual features, which uses domain-adversarial training for speaker independence. Domain-adversarial training is integrated into the optimization of a lipreader based on a stack of feedforward and LSTM (Long Short-Term Memory) recurrent neural networks, yielding an end-to-end trainable system which only requires a very ...
متن کاملGaussian Map based Acoustic Model Adaptation Using Untranscribed Data for Speech Recognition in Severely Adverse Environments
This study proposes an acoustic model adaptation scheme to improve speech recognition in severely adverse environments utilizing untranscribed data. In the proposed method, a clean GMM is estimated from clean training data, and a noisecorrupted GMM is obtained by MAP adaptation over the adaptation data. The Gaussian component of the adapted HMMs is obtained using the transform of the most simil...
متن کاملActive and unsupervised learning for automatic speech recognition
State-of-the-art speech recognition systems are trained using human transcriptions of speech utterances. In this paper, we describe a method to combine active and unsupervised learning for automatic speech recognition (ASR). The goal is to minimize the human supervision for training acoustic and language models and to maximize the performance given the transcribed and untranscribed data. Active...
متن کاملActive and Unsupervised Learning for A
State-of-the-art speech recognition systems are trained using human transcriptions of speech utterances. In this paper, we describe a method to combine active and unsupervised learning for automatic speech recognition (ASR). The goal is to minimize the human supervision for training acoustic and language models and to maximize the performance given the transcribed and untranscribed data. Active...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1998